Genetic architecture of end-use quality traits in soft white winter wheat

Background Genetic improvement of end-use quality is an important objective in wheat breeding programs to meet the requirements of grain markets, millers, and bakers. However, end-use quality phenotyping is expensive and laborious; thus, testing is often delayed until advanced generations. To better understand the underlying genetic architecture of end-use quality traits, we investigated the phenotypic and genotypic structure of 14 end-use quality traits in 672 advanced soft white winter wheat breeding lines and cultivars adapted to the Pacific Northwest region of the United States. Results This collection of germplasm had continuous distributions for the 14 end-use quality traits with industrially significant differences for all traits. The breeding lines and cultivars were genotyped using genotyping-by-sequencing and 40,518 SNP markers were used for association mapping (GWAS). The GWAS identified 178 marker-trait associations (MTAs) distributed across all wheat chromosomes. A total of 40 MTAs were positioned within genomic regions of previously discovered end-use quality genes/QTL. Among the identified MTAs, 12 markers had large effects and thus could be considered in the larger scheme of selecting and fixing favorable alleles in breeding for end-use quality in soft white wheat germplasm. We also identified 15 loci (two of them with large effects) that can be used for simultaneous breeding of more than a single end-use quality trait. The results highlight the complex nature of the genetic architecture of end-use quality, and the challenges of simultaneously selecting favorable genotypes for a large number of traits. This study also illustrates that some end-use quality traits were mainly controlled by a larger number of small-effect loci and may be more amenable to alternate selection strategies such as genomic selection.
Conclusions In conclusion, a breeder may be faced with the dilemma of balancing genotypic selection in early generation(s) versus costly phenotyping later on. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-022-08676-5.


Introduction
End-use quality improvement in soft white wheat (Triticum aestivum L.) is one of the primary objectives of wheat breeding programs. End-use quality is complex and involves multiple traits. The key end-use quality parameters in soft white wheat include softer kernels, lower grain protein content and gluten strength, less damaged starch, lower non-starch polysaccharide content leading to decreased water absorption capacity, and larger cookie diameter and cake volume. For some soft wheat products, starch paste viscosity is a key quality trait.
A number of end-use quality traits are influenced by the effect(s) of major genes. For example, the genetic architecture of grain hardness is primarily controlled by the puroindolines, of gluten strength by the high molecular weight glutenins, and of starch paste viscosity by the granule bound starch synthase ('waxy') genes [1,26]. However, these major genes are often fixed in elite breeding populations due to parent selection, or early generation phenotypic and/or genotypic selection and do not sufficiently account for the levels of end-use quality required for cultivar release nor for the range of variation observed among breeding populations [25].
A number of mapping studies for end-use quality were performed in bi-parental populations, and in some cases one or both parents were either poorly adapted or would not constitute 'elite' germplasm for applied plant breeding [9,11,13]. Additionally, the bi-parental genetic structure limits QTL mapping resolution. Genome-wide association mapping (GWAS) can overcome these limitations by exploiting historical recombination events that occurred throughout the evolution of the germplasm and by using elite breeding germplasm from the breeding program of interest. In this study we implemented GWAS using recent breeding lines and cultivars from the Washington State University (WSU) soft white winter wheat breeding program to investigate the underlying genetic architecture of phenotypic variation of 14 end-use quality traits in 672 soft white winter wheat genotypes. We identified end-use quality associated single nucleotide polymorphism (SNP) markers using GWAS and identified large effect QTL. These QTL contribute to a better understanding of the underlying genetic architecture of end-use quality in soft white wheat and provide an objective assessment of the potential for marker-assisted selection (MAS) versus other genotypic and phenotypic selection strategies.

Plant materials
A total of 672 soft white winter wheat breeding lines and cultivars were used in this study. The breeding lines were F4:5 lines and doubled haploid lines selected from different crosses to represent the diversity present in the WSU winter wheat breeding program. The genotypes and the environments in which the lines were grown were described in Aoun et al. [27]. In brief, this germplasm was evaluated in 29 environments (year-location combinations). Genotypes were grown from 2015 to 2019 in seven locations in Washington State (WA), USA, including Pullman, Lind, Davenport, Ritzville, Waterville, Walla Walla, and Dayton. In this dataset, there were 1-7 nurseries per environment with a total of 76 nurseries. From each nursery, a single sample from one replicate per genotype was evaluated for end-use quality traits. The dataset was unbalanced with some shared lines between environments. The connectivity between environments in terms of genotypes was described in Aoun et al. [27]. There were 43 genotypes (out of the 672) evaluated for end-use quality in more than one-quarter of the environments [27].

Phenotypic data
The wheat genotypes were evaluated for 14 end-use quality traits classified into four categories: grain characteristics, milling traits, flour characteristics, and baking parameters. The phenotypic and genotypic data were retrieved from Aoun et al. [27], which investigated genotype × environment interactions and tested the performance of genomic prediction for the 14 end-use quality traits. Traits associated with grain characteristics included Single Kernel Characterization System (SKCS) hardness, SKCS size, SKCS weight, test weight, and grain protein content. SKCS hardness is a key determinant of end-use quality: hard wheat is mainly used for making bread, and soft wheat is primarily used for making cookies, cakes, and confectionery products [1,28,29]. In the grain market, test weight and grain protein content are the two main parameters. High test weight, which is correlated with kernel weight and size [30,31], usually leads to higher milling performance [32].
Milling traits included break flour yield, flour yield, flour ash content, and milling score. Break flour yield was calculated as the percent of flour recovered from the break rolls, whereas flour yield ('straight grade') was determined as the proportion of grain recovered as flour (break plus reduction flour). Flour ash content is the minerals remaining after flour combustion. Milling score was a function of both flour yield and flour ash content [33]. Higher break flour yield, flour yield, and milling score are desirable in soft wheat. Higher inclusion of bran reduces the functionality of most doughs and batters [34]. As such, mineral content of flours (ash) serves as a proxy for bran contamination and lower flour ash is preferred.
Flour functionality plays an important role in baking performance. Flour parameters included flour protein content, flour sodium dodecyl sulfate sedimentation volume (SDS sedimentation), water solvent retention capacity (water SRC), and flour swelling volume (FSV). Unlike bread, soft white wheat products require lower grain/flour protein content, weaker gluten strength (lower SDS sedimentation volume), and low water SRC. FSV is an end-use quality parameter associated with the amount of amylose and amylopectin components in endosperm starch [35] and needs to be high for making some Asian-style noodles [1,36]. The baking parameter cookie diameter is considered an important indicator of the overall quality of soft wheat [28,37] and has been a key selection trait in soft wheat breeding programs.
These end-use quality traits were measured following the procedures from the American Association of Cereal Chemists International [38] and as described by Aoun et al. [27]. The data set was analyzed using a mixed linear model in the R package lme4 [39,40]. The environments were considered random, while genotypes were fitted as fixed in the model. For each trait, best linear unbiased estimators (BLUEs) of the genotypes were extracted from the mixed linear model and used for further statistical analysis. Broad sense heritability (H², Cullis method) and correlations between traits were previously described by Aoun et al. [27].
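As a rough sketch, the mixed model described above (genotypes fixed, environments random) could be written as follows. The symbols are illustrative assumptions, and the model actually fitted by Aoun et al. [27] may include additional terms.

```latex
% Illustrative notation only (not taken from the source):
% y_{ij}: phenotype of genotype i in environment j
% \mu: overall mean; g_i: fixed effect of genotype i
% e_j: random effect of environment j; \varepsilon_{ij}: residual
y_{ij} = \mu + g_i + e_j + \varepsilon_{ij},
\qquad e_j \sim N(0, \sigma^2_e),
\qquad \varepsilon_{ij} \sim N(0, \sigma^2_\varepsilon)

% Under this notation, the BLUE of genotype i is the estimated genotype mean
\widehat{\mathrm{BLUE}}_i = \hat{\mu} + \hat{g}_i
```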

Genotyping
Genotyping-by-sequencing (GBS) [41] was used to genotype the 672 soft white wheat breeding lines and cultivars. The genotypic data for the 672 genotypes were previously provided by Aoun et al. [27]. The GBS was performed at the North Carolina State University Genomic Sciences Laboratory in Raleigh, NC, USA. The sequence reads were aligned to the T. aestivum RefSeq v1.0 reference genome [42] and SNP data were filtered for minor allele frequency (MAF) ≥ 5%, missing data ≤ 30%, and heterozygous frequency ≤ 15%. From this, 40,518 SNPs were used for further analysis. Missing datapoints in the SNP data were imputed using the expectation-maximization algorithm implemented in the package rrBLUP [43] in R version 4.0.2 [44].
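The three SNP filters described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function name, the 0/1/2 genotype coding, and the choice to compute heterozygosity among non-missing calls are all assumptions.

```python
def snp_passes_filters(calls, min_maf=0.05, max_missing=0.30, max_het=0.15):
    """Return True if one SNP passes the MAF, missing-data, and heterozygosity filters.

    calls: one genotype call per line, coded as the count of the alternate
    allele (0, 1, or 2), with None for a missing call (assumed coding).
    """
    n_total = len(calls)
    observed = [c for c in calls if c is not None]
    if not observed:
        return False  # nothing genotyped at this SNP

    # Fraction of lines with no call
    missing_rate = (n_total - len(observed)) / n_total
    # Fraction of heterozygous calls among genotyped lines (assumption)
    het_rate = sum(1 for c in observed if c == 1) / len(observed)
    # Alternate-allele frequency, then minor allele frequency
    alt_freq = sum(observed) / (2 * len(observed))
    maf = min(alt_freq, 1 - alt_freq)

    return maf >= min_maf and missing_rate <= max_missing and het_rate <= max_het


# Hypothetical SNPs scored on 100 lines
common = [0] * 90 + [2] * 8 + [1] * 2      # MAF 0.09, no missing data
rare = [0] * 99 + [2]                      # MAF 0.01
sparse = [0] * 30 + [2] * 30 + [None] * 40 # 40% missing
print(snp_passes_filters(common), snp_passes_filters(rare), snp_passes_filters(sparse))
```

Only the first hypothetical SNP is retained; the second fails the MAF threshold and the third fails the missing-data threshold.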

Population structure and linkage disequilibrium
To visualize the population structure in the 672 genotypes, principal component analysis (PCA) was performed using the 'prcomp' function in R based on 40,518 SNPs. The population structure was visualized using the first two principal components (PCs) that explained the highest percentage of variation. Pairwise linkage disequilibrium (LD) between SNPs (r²) was estimated using TASSEL v5 [45] by applying a sliding window of 50 markers. The r² values of marker pairs were plotted against the physical distances in megabase pairs (Mb) after randomly selecting 10% of the total SNP pairs. To visualize the LD decay across the genome and for each of the 21 chromosomes, a locally estimated scatterplot smoothing (LOESS) curve was fitted using the function 'geom_smooth' in the R package ggplot2 [46]. The r² threshold was derived from the 95th percentile of the distribution of unlinked r² (for markers on different chromosomes) [47] that were significant at the 99.99% level of confidence. The r² threshold is the value beyond which LD was likely to be caused by genetic linkage. The intersection of the horizontal line at the r² threshold value with the LOESS curve on the LD scatter plot was considered as the estimate of the extent of LD across the genome (genome-wise LD decay plot) and across each chromosome (chromosome-wise LD decay plot).
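To make the LD threshold concrete, the sketch below computes r² for a pair of SNPs and takes the 95th percentile of r² values from unlinked marker pairs. It is a simplified stand-in, not the TASSEL procedure: r² is approximated here as the squared Pearson correlation of 0/1/2 genotype codes, the significance filtering step is omitted, and the function names are hypothetical.

```python
import statistics


def r_squared(snp_a, snp_b):
    """Squared Pearson correlation between two SNPs coded 0/1/2 (approximation of LD r²)."""
    n = len(snp_a)
    mean_a = sum(snp_a) / n
    mean_b = sum(snp_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(snp_a, snp_b))
    var_a = sum((a - mean_a) ** 2 for a in snp_a)
    var_b = sum((b - mean_b) ** 2 for b in snp_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (var_a * var_b)


def ld_threshold(unlinked_r2, percentile=95):
    """Percentile of r² among marker pairs on different chromosomes."""
    cut_points = statistics.quantiles(unlinked_r2, n=100, method="inclusive")
    return cut_points[percentile - 1]


# Hypothetical genotype vectors for four lines
print(r_squared([0, 2, 0, 2], [0, 2, 0, 2]))  # identical SNPs: complete LD
print(r_squared([0, 2, 0, 2], [0, 0, 2, 2]))  # independent SNPs: no LD
```

The extent of LD is then read off as the physical distance at which the smoothed r² curve crosses the threshold returned by `ld_threshold`.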

Genome-wide association mapping
The BLUEs for each trait were considered as the phenotype in the GWAS. Association mapping was performed using three models: 1) mixed linear model (MLM), 2) Fixed and random model Circulating Probability Unification (FarmCPU) [48], and 3) Bayesian-information and Linkage-disequilibrium Iteratively Nested Keyway (BLINK) [49], all implemented in the GAPIT R package [50]. The single-locus MLM is the most widely used in association mapping studies. However, it tests one marker at a time and therefore is likely to increase the number of false negatives for complex traits [22,51]. Multi-locus models such as FarmCPU were proposed to overcome this problem. FarmCPU iteratively uses fixed and random models in which the identified significant SNPs from the iterations are fitted as cofactors [48]. FarmCPU was reported to control for false negatives and false positives without causing model overfitting. BLINK was derived from the FarmCPU method with a few modifications. BLINK does not assume that causal genes are evenly distributed across the genome. It also works directly on markers instead of bins and excludes markers in LD with the most significant markers. BLINK uses the Bayesian Information Criterion (BIC) of a fixed effect model to approximate the maximum likelihood of a random effect model to select marker-trait associations (MTAs).
The GWAS models considered family relatedness (Kinship matrix or K matrix) [52] and population structure (Q matrix). The K matrix was included in all GWAS models, whereas the optimal number of principal components (PCs) in the Q matrix was determined based on quantile-quantile (Q-Q) plots that visualize the expected -log10(P) versus the observed -log10(P). The number of PCs included in the GWAS models was limited to the first four PCs. Manhattan plots for MTAs were visualized using the R package 'qqman' [53]. MTAs were considered significant at a false discovery rate (FDR) [54] of ≤ 0.05.
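The FDR cutoff can be illustrated with a short sketch. It assumes that the FDR procedure cited as [54] is the Benjamini-Hochberg step-up method; the function name and the example P-values are hypothetical and are not results from this study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the original marker order."""
    m = len(pvalues)
    # Marker indices sorted from smallest to largest P-value
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical P-values for four markers
pvals = [0.001, 0.2, 0.03, 0.04]
adjusted = benjamini_hochberg(pvals)
significant = [i for i, q in enumerate(adjusted) if q <= 0.05]
print(adjusted, significant)
```

In this toy example only the first marker remains significant at FDR ≤ 0.05, even though three of the four raw P-values are below 0.05.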

Phenotypic data
The distributions of BLUEs for each of the 14 end-use quality traits are illustrated in Fig. 1. For cookie diameter, BLUEs ranged from 7.8 to 9.7 cm. For all of these traits, differences in phenotypes would be considered to be industrially significant, with many values below minimum targets [25]. Moderate to high broad sense heritability (H² = 0.46-0.70) was observed for all traits except for grain and flour protein content (H² = 0.18-0.19) [27].

Population structure and linkage disequilibrium
Of the 40,518 SNPs, there were 14,102 (34.8%) SNPs on the A genome, 16,626 (41.0%) SNPs on the B genome, 8,656 (21.4%) SNPs on the D genome, and 1,134 (2.8%) SNPs on unaligned (UN) chromosome(s). PCA based on the first two PCs showed minimal clustering in wheat genotypes, which was expected since the plant materials in this study were from the same wheat breeding program (Supplementary Fig. S1). The first 10 PCs accounted cumulatively for 26.3% of the variation. The first four PCs explained 5.3%, 4.0%, 3.3% and 3.0% of variation, respectively. The genome-wise LD dropped to an r² threshold of 0.1 within 6.5 Mb on average (Supplementary Fig. S2). LD decayed to 0.1 at ~2.5-5.0 Mb for chromosomes on the A genome, and at 5.0-10 Mb for chromosomes on the B and D genomes (Supplementary Fig. S3).

Genome-wide association mapping

GWAS model selection
The best models within each method were selected based on examination of Q-Q plots. For MLM, we selected MTAs from the K (Kinship) model. Using FarmCPU, K + 2PCs (Kinship and Q based on the first two PCs) was selected to model SKCS hardness, SKCS weight, grain protein content, flour yield, flour ash, milling score, SDS sedimentation, water SRC, and cookie diameter, whereas K + 3PCs (Kinship and Q based on the first three PCs) was selected to model the remaining traits, SKCS size, test weight, break flour yield, flour protein content, and FSV. For BLINK, we selected K + 4PCs (Kinship and Q based on the first four PCs) for all traits.
In contrast to the Q-Q plots generated from FarmCPU models, the Q-Q plots from MLM and BLINK did not show a sharp deviation of the observed P-value distribution from the expected P-value distribution (Supplementary Fig. S4, S5, S6). These results suggest that FarmCPU provided better control of false negatives and false positives than MLM and BLINK. Thus, only association mapping results from FarmCPU will be discussed in this study (Tables 1, 2; Tables S1, S2).

Marker-trait associations
Based on the LD between markers, each MTA identified from the FarmCPU models represents a distinct locus or QTL. Considering all traits together, a total of 178 significant MTAs were identified across all wheat chromosomes (Tables 1, 2, 3, 4). There were 12 large-effect markers associated with 11 traits (1-2 markers per trait) (Table 5). For SKCS size, FSV, and cookie diameter, all significant markers had small effects. For grain characteristics, five MTAs with large effects were detected on chromosomes 1B, 2B, 4B, 5A, and 6B (Table 5). Markers S5A_480515221 and S6B_705613777 were associated with SKCS hardness and impacted the hardness index by 6.7 and 7.9 units on average, respectively. Marker S2B_533178165 was associated with SKCS weight and influenced the phenotype by 3.55 mg. For test weight, S4B_413497949 had the largest effect and resulted in a 1.2 kg/hL difference in the phenotype on average, whereas for grain protein content, S1B_46883868 had the largest effect with a 0.85% increase/decrease in the phenotype. S1B_46883868 was also associated with flour protein content and affected the trait value by 0.63% on average.
For milling traits, five markers had large effects and were detected on chromosomes 1B, 1D, 4A, 5A, and 6B (Table 5). Marker S1B_653681752 was associated with break flour yield and flour yield and influenced trait values by 2.9% and 2.7% on average, respectively. An additional large effect marker was associated with flour yield: S6B_19335996 affected the trait value by 2.5%. Marker S4A_120144412 was associated with flour ash and influenced the phenotype by 0.03% on average. For milling score, S1D_14707739 and S5A_20640566 affected the phenotype by 3.8 and 2.4 units, respectively.
For flour parameters, there were four markers with large effects located on chromosomes 1B and 1D (Table 5). In addition to break flour yield and flour yield, S1B_653681752 also influenced water SRC by 4.9% on average. For SDS sedimentation, two large effect markers were identified, S1D_121990680 and S1D_411063068, which affected the trait values by 1.3 and 1.4 g/mL, respectively. Except for S5A_480515221, S4A_120144412, and S1D_121990680, which were associated with SKCS hardness, flour ash, and SDS sedimentation, respectively, the favorable alleles for the remaining nine large effect markers were present in high frequencies (86-93%) in this soft white wheat germplasm.

Loci associated with at least two end-use quality traits
Among the 178 MTAs identified in this study, there were 17 loci associated with more than a single end-use quality trait (Table 6). Among these loci, there were two large-effect markers (S1B_46883868 and S1B_653681752) and ten small-effect markers that were associated with at least two end-use quality traits. For each of these 12 markers, there was desirable linkage between the favorable alleles. This suggests that these markers have desirable pleiotropic effects and could be useful to simultaneously breed for more than a single end-use quality trait. In addition to these 12 markers, five loci on chromosomes 1A, 1B, 6B, 7A, and 7B were associated with more than a single end-use quality trait (Table 6). For each of these loci, LD between significant markers that were associated with different traits was higher than the r² threshold of 0.1. For three of these loci, S1A_586706397/S1A_587581129, S6B_27918221/S6B_29771821 and S7A_730416426/S7A_731026067, there was desirable linkage between marker alleles in each locus, whereas for the other two loci, S1B_561712520/S1B_569507932 and S7B_624947199/S7B_636744313, there was unfavorable linkage. Therefore, for the latter two loci, selecting for one trait could negatively affect the other trait.

Co-localized MTAs with previously identified end-use quality genes/QTL
A total of 35 annotated genes were located close to the physical positions of the 12 large effect markers. The putative functions of these genes are described in Supplementary Table S3. In addition, we found 13 GWAS MTAs available at the IWGSC sequence repository that were within the genomic regions of the 12 large effect markers identified in our study. These GWAS MTAs from previous studies were associated with thousand kernel weight, test weight, grain fill duration, grain protein content, SDS sedimentation, and grain minerals (Cu and Zn) (Supplementary Table S3). Furthermore, comparative mapping (based on physical positions of molecular markers) between all 178 MTAs identified in this study and end-use quality QTL/genes from previous genetic studies [2,4,6-17,19-22] showed that 40 MTAs were positioned within genomic regions of previously discovered end-use quality genes/QTL (Supplementary Table S4).
Four of the loci associated with grain protein content in this study were previously reported (Supplementary Table S4). These include S1B_633974958, which was positioned [...], which were associated with grain protein content and wet gluten content, respectively [7]. Similarly, S4B_63121316 and S5D_543253602 were found close to the grain protein content-associated markers gwm368 (60 Mb, [11]) and wPt-9788/wPt-0400 (560 Mb, [12]), respectively. On chromosome 7A, the grain protein content associated marker S7A_731026067 was 18 Mb from Qgpc.7A.1 (wmc525, [19]), which was associated with both protein content and dry gluten content. For break flour yield, the associated locus tagged by marker S3B_630394456 was at the physical position of IWA6254 (630 Mb), which was also associated with break flour yield [17]. For flour yield, five of the identified MTAs were mapped close to flour yield associated loci in previous genetic studies (Supplementary Table S4). For instance, S1A_9128313 and S1A_15686346 were in close proximity to QFY.ksu-1A (7 Mb, [21]). S1A_9128313 and S1A_15686346 were also within the genomic regions of the genes TraesCS1A01G010900 (5 Mb) and TraesCS1A01G039600 (16 Mb), respectively. TraesCS1A01G010900 and TraesCS1A01G039600 were annotated as low molecular weight glutenin subunit and high molecular weight glutenin subunit, respectively. Another MTA on chromosome 1B associated with flour yield, S1B_653681752, was identified close to QFY.ksu-1B (649 Mb) [21]. Similarly, S5A_382294123 and S6B_19335996 were found in proximity to IWB76667 (384 Mb, [2]) and IWA7725 (27 Mb, [17]), respectively.
For flour protein, S2D_28763299 was located 15-17 Mb from QGpc.caas-2D [7], which was associated with grain protein content, and QWgc.WY-2D.5 [8], which was associated with wet gluten content. Similarly, S7A_730416426, associated with flour protein content in this study, was ~17 Mb away from qGPC.7A.1 [19], which was associated with grain protein content and dry gluten content (Supplementary Table S4).
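The comparative mapping above rests on physical distances between markers. As a minimal sketch, assuming the SNP names follow the pattern `S<chromosome>_<position in bp>` (inferred from the names in this text), the distance between an MTA and a previously reported locus can be computed as follows; the helper names are hypothetical.

```python
def parse_marker(name):
    """Split a SNP name such as 'S1B_653681752' into ('1B', 653681752)."""
    if not name.startswith("S") or "_" not in name:
        raise ValueError(f"unexpected marker name: {name}")
    chrom, pos = name[1:].split("_", 1)
    return chrom, int(pos)


def distance_mb(marker, ref_chrom, ref_pos_mb):
    """Physical distance in Mb to a reference locus, or None if on another chromosome."""
    chrom, pos_bp = parse_marker(marker)
    if chrom != ref_chrom:
        return None
    return abs(pos_bp / 1e6 - ref_pos_mb)


# Flour yield MTA versus QFY.ksu-1B, reported at 649 Mb in the text above
print(round(distance_mb("S1B_653681752", "1B", 649), 1))
# Break flour yield MTA versus IWA6254, reported at 630 Mb
print(round(distance_mb("S3B_630394456", "3B", 630), 1))
```

Distances computed this way are only as precise as the reference positions, which the text reports rounded to the nearest Mb.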

Discussion
This study used historical data that captured a wide range of phenotypic variation for end-use quality within a soft white winter wheat breeding program. Heritability estimates for end-use quality traits were moderate to high except for grain and flour protein content. This suggests that most traits are primarily controlled by genetic factors and that genotypic selection (as opposed to phenotypic selection) is a rational strategy. This study identified the genetic architecture underlying 14 end-use quality traits among recent breeding lines and cultivars from a soft white winter wheat breeding program. Prior to this study, Jernigan et al. [2] investigated the genetic architecture of end-use quality in a set of 480 advanced soft white winter wheat breeding lines and cultivars from Pacific Northwest breeding programs selected from 1992 to 2014. Thus, the germplasm used in this study is different from that used by Jernigan et al. [2]. Consequently, our investigation was expected to corroborate previous QTL and/or discover additional QTL associated with end-use quality in soft white wheat.
Identified MTAs in this study as well as genotypes with favorable alleles will be useful for end-use quality improvement in soft white and other types of wheat. The 12 large effect markers can be converted into Kompetitive Allele Specific PCR (KASP) or semi-thermal asymmetric reverse PCR (STARP) markers for use in marker-assisted selection (MAS). Among these large effect markers, S1B_653681752 is useful to breed for higher break flour yield and flour yield and lower water SRC. Similarly, S1B_46883868 is associated with both grain protein content and flour protein. The favorable alleles of nine of the large effect markers were present in high frequencies in this germplasm. This suggests that these markers were under high selection pressure in the soft white wheat breeding program, likely the result of long-term phenotyping and selection, and the pyramiding of favorable alleles across the breeding populations.
Table 6 Loci associated with two or more end-use quality traits in 672 soft white winter wheat genotypes. Markers in bold had large effects. Markers before and after the forward slash (/) were in linkage disequilibrium (r² ≥ 0.1). S1A_586706397 and S1A_587581129 were associated with cookie diameter and flour yield, respectively; S1B_561712520 and S1B_569507932 were associated with SDS sedimentation and SKCS size, respectively; S6B_27918221 and S6B_29771821 were associated with milling score and water SRC, respectively; S7A_730416426 and S7A_731026067 were associated with flour protein and grain protein content, respectively; and S7B_624947199 and S7B_636744313 were associated with flour ash and test weight, respectively.
Based on our comparative mapping, eight of the large effect markers, S1B_46883868, S1D_14707739, S1D_121990680, S1D_411063068, S2B_533178165, S4A_120144412, S4B_413497949, and S5A_20640566, were not reported in previous studies and thus should be prioritized for MAS. Only a few loci were found to have large effects, suggesting that many end-use quality traits have complex genetic architecture and are mainly controlled by several minor genes with small effects. For some traits like SKCS size, FSV, and cookie diameter, all identified markers had small effects, suggesting that MAS may not be useful for these traits. Therefore, genomic selection might be a better approach to implement for such traits [27,55].

Grain characteristics
Grain characteristics greatly influence wheat end-use quality [4,7,11,30,47]. Grain hardness affects most end-use quality traits including break flour yield, flour yield, flour particle size, starch damage, dough strength, and cookie diameter [27,56-58]. The variation in grain hardness in the present soft wheat germplasm, like most soft wheat breeding populations, is independent of the puroindolines because wild-type puroindoline genes at the Ha locus are generally fixed. This is consistent with the absence of MTAs on chromosome 5DS in this study. Other grain characteristics including SKCS size, SKCS weight, test weight, and grain protein influence wheat milling performance [28,30]. SKCS size and SKCS weight were highly correlated in this germplasm (r = 0.8; [27]) and this was reflected in the GWAS in which S2D_563799166 and S6B_583281710 were found to be associated with both traits.
Grain protein content is an essential quality trait that affects flour functionality. Unlike bread, soft wheat products often require lower protein levels to minimize gluten formation and mixing strength [5]. The positive correlation (r = 0.4-0.5; [27]) between grain/flour protein content and SDS sedimentation (a measure of gluten strength) in this germplasm provides further evidence of their direct relationship. However, based on the GWAS, no significant markers were in common between SDS sedimentation and grain/flour protein content. Grain and flour protein were phenotypically correlated in this germplasm [27]. This relationship was also evident in our GWAS in which five markers were associated with both grain and flour protein. Grain and flour protein in this wheat collection had low heritability estimates and high genotype by environment interactions as described by Aoun et al. [27]. Consequently, most markers associated with grain/flour protein in this study had small effects, except for marker S1B_46883868.

Milling traits
Higher break flour yield, higher flour yield, lower flour ash, and higher milling score are desirable traits in soft wheat. Cultivars with alleles that improve these traits could lead to higher milling performance and thus greater profit for flour millers. Moderate to high heritability estimates and positive correlations among milling traits in this germplasm [27] suggest that genetic gain and simultaneous breeding for these traits is possible. Positive correlations between milling traits were also evident in our GWAS results. For instance, the favorable alleles of S1B_653681752 and S5B_508665777 for break flour yield were also associated with higher flour yield. Similarly, the favorable allele of S6D_471614981 for flour yield was also associated with higher milling score. Negative correlations between milling score and ash in this germplasm (r = -0.7) were discussed in Aoun et al. [27]. This desirable negative correlation was also reflected in our GWAS in which the S5B_68052478 minor allele was associated with lower ash and higher milling score. We found that S1B_100055026, which was associated with break flour yield, was located close to the Glu-B3 gene flanked by the DArT marker wPt-1317 (137 Mb, [14]). Similarly, the flour yield associated marker in this study, S1B_555294134, was located 1 Mb from Glu-B1 (556 Mb). It is well known that glutenin subunit families are major components of wheat endosperm storage proteins and are associated with many end-use quality traits. The presence of break flour yield and flour yield associated loci close to Glu-B1 and Glu-B3 may suggest that there is a genetic association between endosperm storage proteins and endosperm structure as evidenced by Boehm Jr et al. [59]. The composition of the protein matrix surrounding starch granules likely contributes to the mechanical strength of the endosperm.

Flour and baking parameters
Unlike bread, confectionery products require lower gluten strength and water absorption capacity, which were measured using SDS sedimentation and water SRC, respectively. Higher water SRC is in part due to starch damage from milling and non-starch polysaccharides [5,33,60], and thus lower water absorption is preferred as it results in better cookie spread and lower viscosity batters. Three water SRC associated markers co-localized with milling trait associated markers: S1B_653681752, S5A_382294123, and S6B_27918221/S6B_29771821. Negative correlations between water SRC and milling traits previously discussed by Aoun et al. [27] were also observed in our GWAS results, particularly for markers S1B_653681752 and S5A_382294123.
Higher FSV is desirable for making some Asian-style noodles [1,36]. We found that S1A_534055653, which was associated with FSV in our study, was near the gene Glu-A1 flanked by the SSR marker wmc312 (511 Mb, [14]). This result suggests a genetic correlation between gluten content/strength and FSV. A similar observation was made for cookie diameter, for which the associated marker S1B_573323546 was close to the position of the gene Glu-B1. The FSV associated marker S7D_38000037 from this study was 2 Mb from the waxy locus Wx-D1. The association between S7D_38000037 and any null allele at Wx-D1 is at present unknown, but is unlikely as the known Waxy allele at Wx-D1 is rare [61]. Similarly, we did not identify MTAs for FSV that were close to the locations of the other homoeologous waxy loci Wx-A1 and Wx-B1, which were located on chromosomes 7A and 4A, respectively [35,62]. Mutation/deletion in any of the three waxy loci often results in reduced amylose 'partial waxy' wheat, which is associated with higher FSV. Therefore, the variation in FSV in this germplasm is likely independent of the waxy loci. As noted above, there were no major QTL identified for cookie baking. As such, alternative genotypic selection strategies such as genomic selection may be more appropriate for this trait.

Conclusion
In this study we investigated the phenotypic and genotypic structure of 14 end-use quality traits in 672 soft white winter wheat breeding lines and cultivars adapted to the Pacific Northwest region of the United States. A total of 178 MTAs were identified across all wheat chromosomes, of which 40 MTAs were positioned within genomic regions of previously discovered end-use quality genes/QTL. These results highlight the fact that, among the multitude of traits that a wheat breeder selects for, end-use quality accounts for a relatively large proportion. The high heritability of most traits underscores the success of long-term phenotypic selection. Among the identified MTAs, 12 markers had large effects (eight of them were previously uncharacterized) and thus could be prioritized in breeding programs. For example, a relatively manageable number of lines, say, those resulting from head row selection, could be subjected to a single round of genotypic selection to fix the favorable allele at one or more of the large effect loci. Such a strategy could return benefits later on as a greater proportion of lines would meet end-use quality targets during subsequent replicated yield trials. This study also revealed that for some end-use quality traits (SKCS size, FSV, and cookie diameter), only small effect markers were identified, suggesting that these traits are controlled by multiple minor genes in this germplasm, and that alternative selection strategies such as genomic selection could augment traditional and laborious phenotyping.